Smart Paper Notes

SHRED: 3D Shape Region Decomposition with Learned Local Operations

R. Kenny Jones , Aalia Habib , Daniel Ritchie

Categories: Computer Vision | Machine Learning

2022-06-07

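We present SHRED, a method for 3D SHape REgion Decomposition. SHRED takes a 3D point cloud as input and uses learned local operations to produce a segmentation that approximates fine-grained part instances. We endow SHRED with three decomposition operations: splitting regions, fixing the boundaries between regions, and merging regions together. The modules are trained independently and locally, which allows SHRED to generate high-quality segmentations for categories not seen during training. We train and evaluate SHRED with fine-grained segmentations from PartNet; using its merge-threshold hyperparameter, we show that at any desired decomposition granularity SHRED produces segmentations that better respect ground-truth annotations than baseline methods. Finally, we demonstrate that SHRED is useful for downstream applications: it outperforms all baselines on zero-shot fine-grained part instance segmentation and, when combined with methods that learn to label shape regions, on few-shot fine-grained semantic segmentation.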

translated by Google Translate

The Neurally-Guided Shape Parser: Grammar-based Labeling of 3D Shape Regions with Approximate Inference

R. Kenny Jones , Aalia Habib , Rana Hanocka , Daniel Ritchie

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2021-06-22

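We propose the Neurally-Guided Shape Parser (NGSP), a method that learns how to assign fine-grained semantic labels to regions of a 3D shape. NGSP solves this problem via MAP inference, modeling the posterior probability of a label assignment conditioned on an input shape with a learned likelihood function. To make this search tractable, NGSP employs a neural guide network that learns to approximate the posterior. NGSP finds high-probability label assignments by first sampling proposals with the guide network and then evaluating each proposal under the full likelihood. We evaluate NGSP on the task of fine-grained semantic segmentation of manufactured 3D shapes from PartNet, where shapes have been decomposed into regions that correspond to part instance over-segmentations. We find that NGSP delivers significant performance improvements over comparison methods that (i) use regions to group per-point predictions, (ii) use regions as a self-supervisory signal, or (iii) assign labels to regions under alternative formulations. Further, we show that NGSP maintains strong performance even with limited labeled data or when shape regions undergo artificial corruption. Finally, we demonstrate that NGSP can be directly applied to CAD shapes found in online repositories and validate its effectiveness with a perceptual study.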
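
The propose-then-score search described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `guided_map_search`, the toy guide probabilities, and the toy likelihood are all assumptions made for illustration, standing in for NGSP's learned guide network and learned likelihood.

```python
import math
import random


def guided_map_search(regions, labels, guide_probs, log_likelihood,
                      num_proposals=200, seed=0):
    """Approximate MAP search: sample label assignments from a guide
    distribution, score each under the full likelihood, keep the best."""
    rng = random.Random(seed)
    best, best_score = None, -math.inf
    for _ in range(num_proposals):
        # Draw one label per region from the guide's per-region distribution.
        proposal = tuple(
            rng.choices(labels, weights=guide_probs[region])[0]
            for region in regions
        )
        # Evaluate the whole assignment jointly under the full likelihood.
        score = log_likelihood(proposal)
        if score > best_score:
            best, best_score = proposal, score
    return best, best_score


# Hypothetical toy problem: three regions of a chair-like shape.
LABELS = ["seat", "back", "leg"]
REGIONS = [0, 1, 2]
# Assumed guide output: per-region label probabilities (made up).
GUIDE_PROBS = {
    0: [0.8, 0.1, 0.1],
    1: [0.1, 0.8, 0.1],
    2: [0.1, 0.1, 0.8],
}


def toy_log_likelihood(assignment):
    """Made-up joint score: reward agreement with a hidden target and
    penalize assignments that use the 'seat' label more than once."""
    target = ("seat", "back", "leg")
    matches = sum(a == t for a, t in zip(assignment, target))
    penalty = 2.0 if assignment.count("seat") > 1 else 0.0
    return float(matches) - penalty


best_assignment, best_score = guided_map_search(
    REGIONS, LABELS, GUIDE_PROBS, toy_log_likelihood
)
print(best_assignment, best_score)
```

The guide only has to put reasonable mass near good assignments; the final choice is made by the joint score, which can capture constraints between regions that per-region sampling ignores.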

Relevance Classification of Flood-related Twitter Posts via Multiple Transformers

Wisal Mukhtiar , Waliiya Rizwan , Aneela Habib , Yasir Saleem Afridi , Laiq Hasan , Kashif Ahmad

Categories: Natural Language Processing

2023-01-01

In recent years, social media has been widely explored as a potential source of communication and information in disasters and emergency situations. Several interesting works and case studies in disaster analytics, exploring different aspects of natural disasters, have already been conducted. Along with its great potential, disaster analytics comes with several challenges, mainly due to the nature of social media content. In this paper, we explore one such challenge and propose a text classification framework to deal with noisy Twitter data. More specifically, we employ several transformers, both individually and in combination, to differentiate between relevant and non-relevant Twitter posts, achieving a highest F1-score of 0.87.

Site Assessment and Layout Optimization for Rooftop Solar Energy Generation in Worldview-3 Imagery

Zeyad Awwad , Abdulaziz Alharbi , Abdulelah H. Habib , Olivier L. de Weck

Categories: Computer Vision

2022-12-07

With the growth of residential rooftop PV adoption in recent decades, the problem of effective layout design has become increasingly important in recent years. Although a number of automated methods have been introduced, these tend to rely on simplifying assumptions and heuristics to improve computational tractability. We demonstrate a fully automated layout design pipeline that attempts to solve a more general formulation with greater geometric flexibility that accounts for shading losses. Our approach generates rooftop areas from satellite imagery and uses MINLP optimization to select panel positions, azimuth angles and tilt angles on an individual basis rather than imposing any predefined layouts. Our results demonstrate that although several common heuristics are often effective, they may not be universally suitable due to complications resulting from geometric restrictions and shading losses. Finally, we evaluate a few specific heuristics from the literature and propose a potential new rule of thumb that may help improve rooftop solar energy potential when shading effects are considered.

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Categories: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

Modelling the Frequency of Home Deliveries: An Induced Travel Demand Contribution of Aggrandized E-shopping in Toronto during COVID-19 Pandemics

Yicong Liu , Kaili Wang , Patrick Loa , Khandker Nurul Habib

Categories: Machine Learning

2022-09-21

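The COVID-19 pandemic dramatically catalyzed the proliferation of e-shopping. This dramatic growth in e-shopping will undoubtedly have a significant impact on travel demand. As a result, the ability of transportation modellers to model e-shopping demand is becoming increasingly important. This study develops models to predict households' weekly home delivery frequencies. We use both classical econometric and machine learning techniques to obtain the best model. We find that socioeconomic factors, such as having an online grocery membership, the average age of household members, the percentage of male household members, and the number of workers in the household, as well as various land use factors, influence home delivery demand. This study also compares the interpretations and performance of the machine learning models and the classical econometric model. The variable effects identified by the machine learning and econometric models agree. However, at similar recall accuracy, the ordered probit model, a classical econometric model, can accurately predict the aggregate distribution of household delivery demand. In contrast, both machine learning models fail to match the observed distribution.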
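
For readers unfamiliar with the ordered probit model mentioned in this abstract, the following Python sketch shows the standard way such a model turns a linear index into category probabilities, P(y = k) = Φ(τ_k − x·β) − Φ(τ_{k−1} − x·β), and how averaging those probabilities over households gives an aggregate distribution. The cutpoints and household index values below are invented for illustration; the abstract does not report the fitted parameters, and this is not the authors' code.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(linear_index, cutpoints):
    """Category probabilities for one observation.

    linear_index is x·beta; cutpoints must be strictly increasing.
    Returns len(cutpoints) + 1 probabilities that sum to 1.
    """
    inner = [normal_cdf(c - linear_index) for c in cutpoints]
    cdf = [0.0] + inner + [1.0]  # CDF at -inf and +inf bounds
    return [cdf[k + 1] - cdf[k] for k in range(len(cdf) - 1)]


def aggregate_distribution(linear_indices, cutpoints):
    """Average the per-household probabilities to get predicted shares."""
    per_household = [ordered_probit_probs(v, cutpoints) for v in linear_indices]
    n = len(per_household)
    return [
        sum(p[k] for p in per_household) / n
        for k in range(len(cutpoints) + 1)
    ]


# Hypothetical values: four ordered delivery-frequency categories.
CUTPOINTS = [-0.5, 0.5, 1.5]
HOUSEHOLD_INDICES = [-0.2, 0.4, 1.1]

shares = aggregate_distribution(HOUSEHOLD_INDICES, CUTPOINTS)
print([round(s, 3) for s in shares])
```

Comparing predicted shares of this kind against observed shares is one way to check whether a model matches the aggregate distribution, which is the comparison the abstract describes.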

An Overview of Violence Detection Techniques: Current Challenges and Future Directions

Nadia Mumtaz , Naveed Ejaz , Shabana Habib , Syed Muhammad Mohsin , Prayag Tiwari , Shahab S. Band , Neeraj Kumar

Categories: Computer Vision | Artificial Intelligence

2022-09-21

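The large volume of video data generated in today's smart cities has raised concerns about its purposeful use. Surveillance cameras, among other sources, are the most prominent contributors to this data, which makes automated analysis difficult in terms of both computation and precision. Violence detection (VD), which falls broadly under the action and activity recognition domain, is used to analyze large video data for anomalous actions caused by humans. The VD literature has traditionally been based on manually engineered features, although standalone deep-learning-based models have since been developed for real-time VD analysis. This paper gives an overview of deep sequence learning approaches along with localization strategies for the detected violence. The overview also covers the earlier image-processing-based and machine-learning-based VD literature and its possible advantages, such as efficiency relative to current complex models. In addition, datasets are discussed to provide an analysis of current models, explaining their pros and cons, with future directions in the VD domain derived from an in-depth analysis of previous methods.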

Multimodal Information Fusion for Glaucoma and DR Classification

Yihao Li , Mostafa El Habib Daho , Pierre-Henri Conze , Hassan Al Hajj , Sophie Bonnin , Hugang Ren , Niranchana Manivannan , Stephanie Magazzeni , Ramin Tadayoni , Béatrice Cochener

Categories: Computer Vision | Machine Learning

2022-09-02

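Multimodal information is frequently available in medical tasks. By combining information from multiple sources, clinicians can make more accurate judgments. In recent years, multiple imaging techniques have been used in clinical practice for retinal analysis, including 2D fundus photographs, 3D optical coherence tomography (OCT), and 3D OCT angiography. Our paper investigates three deep-learning-based multimodal information fusion strategies for retinal analysis tasks: early fusion, intermediate fusion, and hierarchical fusion. The commonly used early and intermediate fusion strategies are simple but do not fully exploit the complementary information between modalities. We develop a hierarchical fusion approach that focuses on combining features across multiple dimensions of the network and on exploring the correlation between modalities. These approaches are applied to glaucoma and diabetic retinopathy classification, using the public GAMMA dataset (fundus photographs and OCT) and a private dataset acquired with the PlexElite 9000 (Carl Zeiss Meditec Inc.), respectively. Our hierarchical fusion method performs best in both cases and paves the way for better clinical diagnosis.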

Detection of diabetic retinopathy using longitudinal self-supervised learning

Rachid Zeghlache , Pierre-Henri Conze , Mostafa El Habib Daho , Ramin Tadayoni , Pascal Massin , Béatrice Cochener , Gwenolé Quellec , Mathieu Lamard

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2022-09-02

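Longitudinal imaging can capture both static anatomical structures and dynamic changes in disease progression, supporting earlier and better patient-specific pathology management. However, conventional approaches for detecting diabetic retinopathy (DR) rarely take advantage of longitudinal information to improve DR analysis. In this work, we investigate the benefit of exploiting self-supervised learning of a longitudinal nature for DR diagnosis. We compare different longitudinal self-supervised learning (LSSL) methods for modeling disease progression from longitudinal retinal color fundus photographs (CFP), in order to detect early DR severity changes using a pair of consecutive exams. The experiments were conducted on a longitudinal DR screening dataset, with or without the trained encoders (LSSL) acting as a longitudinal pretext task. Results show an AUC of 0.875 for the baseline (a model trained from scratch) and an AUC of 0.96 (95% CI: 0.9593-0.9655, DeLong test) with a p-value < 2.2e-16 for early fusion using a simple ResNet-like architecture with frozen LSSL weights, suggesting that the LSSL latent space can encode the dynamics of DR progression.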

RealityTalk: Real-Time Speech-Driven Augmented Presentation for AR Live Storytelling

Jian Liao , Adnan Karim , Shivesh Jadon , Rubaiat Habib Kazi , Ryo Suzuki

Categories: Natural Language Processing

2022-08-12

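We present RealityTalk, a system that augments real-time live presentations with speech-driven interactive virtual elements. Augmented presentations use embedded visuals and animation for engaging and expressive storytelling. However, existing tools for live presentations often lack interactivity and support for improvisation, while creating such effects in video editing tools requires significant time and expertise. RealityTalk enables users to create live augmented presentations with real-time speech-driven interactions. Users can interactively prompt, move, and manipulate graphical elements through real-time speech and supporting modalities. Based on our analysis of 177 existing video-edited augmented presentations, we propose a novel set of interaction techniques and incorporate them into RealityTalk. We evaluate our tool from a presenter's perspective to demonstrate the effectiveness of the system.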
